Abstract
1 Introduction

Two standard results for the localization of all or some of the zeros of a polynomial, due 
to Cauchy and Pellet, respectively, are given by 

Theorem 1.1. (Cauchy’s theorem - original scalar version) (see Th. (27,1), p. 122 and
Exercise 1, p. 126 in the cited reference) All the zeros of the polynomial
$p(z) = z^n + a_{n-1}z^{n-1} + \cdots + a_1 z + a_0$ with complex coefficients, $n \geq 2$, lie in
$|z| \leq s$, where $s$ is the unique positive solution of

$$x^n - |a_{n-1}|x^{n-1} - \cdots - |a_1|x - |a_0| = 0 .$$
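As a small numerical illustration of Theorem 1.1, the following Python sketch computes the Cauchy radius $s$ by bisection. The function names `unique_positive_root` and `cauchy_radius`, the use of bisection, and the starting bracket $[0, 1 + \max_j |a_j|]$ are assumptions made here for illustration only; the text does not prescribe a particular solver.

```python
import numpy as np

def unique_positive_root(f, hi, iters=200):
    """Bisection for a function that is negative on (0, s) and positive on (s, inf).

    Assumes f(hi) >= 0 and returns an approximation of s.
    """
    lo = 0.0
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if f(mid) >= 0.0:
            hi = mid
        else:
            lo = mid
    return hi

def cauchy_radius(a):
    """a = [a_0, ..., a_{n-1}] for the monic polynomial z^n + a_{n-1} z^{n-1} + ... + a_0.

    Returns the unique positive root of x^n - |a_{n-1}| x^{n-1} - ... - |a_0|
    (or 0 if all coefficients vanish).
    """
    abs_a = np.abs(np.asarray(a, dtype=complex))
    if not abs_a.any():
        return 0.0
    # np.polyval expects the highest-degree coefficient first.
    poly = np.concatenate(([1.0], -abs_a[::-1]))
    # 1 + max|a_j| is a classical upper bound, so the polynomial is positive there.
    return unique_positive_root(lambda x: np.polyval(poly, x), 1.0 + abs_a.max())
```

For $p(z) = z^2 - 3z + 2$ one would call `cauchy_radius([2, -3])`, which solves $x^2 - 3x - 2 = 0$; both zeros of $p$ (1 and 2) lie inside the resulting disc.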

Theorem 1.2. (Pellet’s theorem - original scalar version) (see Th. (28,1), p. 128 in the
cited reference) Given the polynomial $p(z) = z^n + a_{n-1}z^{n-1} + \cdots + a_1 z + a_0$ with complex coefficients,
$a_k \neq 0$, and $n \geq 2$, let $1 \leq k \leq n-1$, and let the polynomial

$$x^n + |a_{n-1}|x^{n-1} + \cdots + |a_{k+1}|x^{k+1} - |a_k|x^k + |a_{k-1}|x^{k-1} + \cdots + |a_0|$$

have two distinct positive roots $x_1$ and $x_2$ with $x_1 < x_2$. Then $p$ has exactly $k$ zeros in or
on the circle $|z| = x_1$ and no zeros in the annular ring $x_1 < |z| < x_2$.
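The condition in Theorem 1.2 can be checked numerically. The sketch below builds the real polynomial of the theorem and returns its positive roots; the function name `pellet_roots`, the tolerance used to decide whether a root is real, and the use of `numpy.roots` are illustrative assumptions rather than anything prescribed by the text.

```python
import numpy as np

def pellet_roots(a, k, tol=1e-9):
    """Positive roots of x^n + |a_{n-1}| x^{n-1} + ... - |a_k| x^k + ... + |a_0|.

    a = [a_0, ..., a_{n-1}] are the coefficients of the monic polynomial p.
    Returns a sorted list; Pellet's theorem applies when it has two entries.
    """
    abs_a = np.abs(np.asarray(a, dtype=complex))
    coeffs = np.concatenate((abs_a, [1.0]))   # ascending powers, leading 1
    coeffs[k] = -coeffs[k]                    # flip the sign of the x^k term
    roots = np.roots(coeffs[::-1])            # np.roots wants descending powers
    return sorted(r.real for r in roots if abs(r.imag) < tol and r.real > tol)
```

For example, with $p(z) = z^3 + 10z + 1$ and $k = 1$, `pellet_roots([1, 10, 0], 1)` returns two positive roots $x_1 < x_2$; one zero of $p$ then has modulus at most $x_1$ and the other two have modulus at least $x_2$.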

Theorem 1.1 provides an upper bound on the moduli of the zeros, whereas Theorem 1.2
sometimes allows zeros to be separated into two different groups, according to the
magnitude of their moduli. However, the latter is very sensitive to the magnitude of the
coefficients and for the theorem to be applicable, one or more coefficients typically have
to be much larger than the others.

The inequalities for the moduli of the zeros in these theorems are sharp in the sense
that there exist polynomials for which they hold as equalities. Our aim is nevertheless to
improve Theorem 1.1 and a few special cases of Theorem 1.2, resulting in a number of
results of a similar nature, i.e., also involving the solution of one or two real equations.
We immediately point out that the solution of such equations requires a negligible compu¬ 
tational effort compared to the computation of the actual zeros (see, e.g., m, IS]), and 
we will not dwell on it. Theorems 1.1 and 1.2 have many applications and are often used
to find good starting points for iterative methods that compute some or all of the zeros.

There are several ways to derive these and many other results related to polynomial 
zeros, one of which is to use linear algebra arguments. Although not necessarily producing 
the shortest proofs, this provides a transparent and often elegant treatment of such results. 
On the other hand, a linear algebra approach does not generally seem to lead to results 
that cannot also be obtained by purely algebraic manipulation or applications of complex 
analysis, an observation also made in HI p.263]. Here, in contrast, we will use linear 
algebra tools to derive results that, it appears, cannot easily be obtained otherwise. 

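Before going into more detail, we recall that the zeros of the complex monic scalar
polynomial $p(z) = z^n + a_{n-1}z^{n-1} + \cdots + a_1 z + a_0$ are the eigenvalues of the $n \times n$ companion
matrix $C(p)$, defined by

$$C(p) = \begin{pmatrix} 0 & & & -a_0 \\ 1 & \ddots & & -a_1 \\ & \ddots & 0 & \vdots \\ & & 1 & -a_{n-1} \end{pmatrix} . \qquad (1)$$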


Blank spaces in the matrices indicate zero elements. Thus, locating the eigenvalues of 
C{p) is equivalent to locating the zeros of p. We will make frequent use of Gershgorin’s 
theorem, which provides inclusion regions for the eigenvalues of a matrix. It is stated next. 

Theorem 1.3. (Gershgorin’s theorem) (see Section 6.1 in the cited reference) All the eigenvalues of
the $n \times n$ complex matrix $A$ with elements $a_{ij}$ and deleted row sums
$R_i'(A) = \sum_{j=1, j \neq i}^{n} |a_{ij}|$ are located in the union of $n$ discs
$\bigcup_{i=1}^{n} \bar{O}(a_{ii}; R_i'(A))$. If $k$ discs are disjoint from the other
discs, then their union contains exactly $k$ eigenvalues.

Because the eigenvalues of a matrix $A$ are the same as those of its transpose $A^T$, an
analogous version is obtained by interchanging rows and columns. We refer to these as 
the row and column versions of the theorem and to the eigenvalue inclusion regions as 
the Gershgorin row and column sets, respectively. A good in-depth exposition of this 
theorem and many related theorems can be found in [16]. As an illustration, let us apply 
the column version to $C(p)$: all the zeros of the polynomial $p$ can be found in the union
of the disc centered at the origin with radius one and the disc centered at $-a_{n-1}$, whose
radius is the $n$th deleted column sum, $\sum_{j=0}^{n-2} |a_j|$. It is often useful to apply an appropriate similarity
transformation to the matrix, which does not change the eigenvalues, although it does 
affect the Gershgorin set. 
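The illustration above can be reproduced with a few lines of Python. This is only a sketch: the helper names `companion` and `gershgorin_column_discs` are assumptions introduced here, and the coefficient ordering `[a_0, ..., a_{n-1}]` is a convention chosen for the example.

```python
import numpy as np

def companion(a):
    """Companion matrix (1) of z^n + a_{n-1} z^{n-1} + ... + a_0, with a = [a_0, ..., a_{n-1}]."""
    a = np.asarray(a, dtype=complex)
    n = a.size
    C = np.zeros((n, n), dtype=complex)
    C[1:, :-1] = np.eye(n - 1)   # ones on the subdiagonal
    C[:, -1] = -a                # last column holds -a_0, ..., -a_{n-1}
    return C

def gershgorin_column_discs(A):
    """Centers and radii (deleted column sums) of the Gershgorin column discs of A."""
    A = np.asarray(A, dtype=complex)
    centers = np.diag(A)
    radii = np.abs(A).sum(axis=0) - np.abs(centers)
    return centers, radii
```

For a cubic with coefficients `a = [2, -3, 1 + 1j]`, the first two discs are centered at the origin with radius one, and the third is centered at $-a_2$ with radius $|a_0| + |a_1| = 5$.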

For example, Theorem 1.1 can be obtained by applying Gershgorin’s theorem to a
specific diagonal similarity transformation of the companion matrix (see, e.g., Theorem 8.6.3
in the cited reference), and the same is true for related results ([10]). We propose to improve
the aforementioned results and derive additional ones by considering the square of the
companion matrix, the eigenvalues of which are the squares of the zeros of $p$. It is given
by


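$$C^2(p) = \begin{pmatrix}
0 & 0 & \cdots & 0 & -a_0 & a_{n-1}a_0 \\
0 & 0 & \cdots & 0 & -a_1 & a_{n-1}a_1 - a_0 \\
1 & 0 & \cdots & 0 & -a_2 & a_{n-1}a_2 - a_1 \\
\vdots & \ddots & \ddots & \vdots & \vdots & \vdots \\
0 & 0 & \cdots & 1 & -a_{n-1} & a_{n-1}^2 - a_{n-2}
\end{pmatrix} . \qquad (2)$$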


The idea of obtaining additional inclusion regions for the eigenvalues of a general matrix 
by squaring it is certainly not new, but it does not typically lead to an improvement (for 
some examples see, e.g., M)- However, there are good reasons to use C‘^{p) instead of 
$C(p)$. First of all, it can be shown ([9, Theorem 2.1]) that squaring does lead to smaller
inclusion regions when the matrix has a zero diagonal, and that is almost the case for a
companion matrix: only one diagonal element is not necessarily zero. Secondly, of equal
importance is the more complicated structure of the squared companion matrix, which
nevertheless remains relatively simple (only its last two columns are nontrivial). The simplicity of the companion matrix
is an advantage when computing its eigenvalues, but it also means that linear algebra 
tools have less room to maneuver when trying to extract information about the location 
of the eigenvalues without actually computing them. As we will see, squaring expands the 
range of useful similarity transformations significantly, while also suggesting a natural and 
convenient reformulation of the zeros of a scalar polynomial as the eigenvalues of a matrix 
polynomial, leading to further advantages. 
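The basic fact used here, namely that the eigenvalues of the squared companion matrix are the squares of the zeros of $p$, is easy to check numerically. The following sketch does so; the helper names `companion` and `squares_mismatch` and the coefficient ordering `[a_0, ..., a_{n-1}]` are assumptions made for the example.

```python
import numpy as np

def companion(a):
    """Companion matrix of z^n + a_{n-1} z^{n-1} + ... + a_0, with a = [a_0, ..., a_{n-1}]."""
    a = np.asarray(a, dtype=complex)
    n = a.size
    C = np.zeros((n, n), dtype=complex)
    C[1:, :-1] = np.eye(n - 1)
    C[:, -1] = -a
    return C

def squares_mismatch(a):
    """Largest distance from a squared zero of p to the nearest eigenvalue of C(p)^2."""
    a = np.asarray(a, dtype=complex)
    C = companion(a)
    squared_zeros = np.roots(np.concatenate(([1.0], a[::-1]))) ** 2
    eig_C2 = np.linalg.eigvals(C @ C)
    return max(np.min(np.abs(eig_C2 - s)) for s in squared_zeros)
```

The same matrix product also exhibits the two nontrivial columns: the next-to-last column of `C @ C` is $-a$, and the last column is $a_{n-1}a$ minus the vector $a$ shifted down by one position.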

Matrix polynomials are encountered when a nonzero complex vector $v$ and a complex
number $z$ are sought such that $P(z)v = 0$, with

$$P(z) = A_n z^n + A_{n-1} z^{n-1} + \cdots + A_0 ,$$

and the coefficients $A_j$ are $m \times m$ complex matrices. If $A_n$ is singular then there are infinite
eigenvalues and if $A_0$ is singular then zero is an eigenvalue. There are $nm$ eigenvalues,
including possibly infinite ones. The finite eigenvalues are the solutions of $\det P(z) = 0$. If
$P$ is a monic matrix polynomial, i.e., $A_n = I$, then its eigenvalues are the eigenvalues of
the $nm \times nm$ block companion matrix $C(P)$, defined by

$$C(P) = \begin{pmatrix} 0 & & & -A_0 \\ I & \ddots & & -A_1 \\ & \ddots & 0 & \vdots \\ & & I & -A_{n-1} \end{pmatrix} .$$

Since the size of I will usually be clear from the context, we omit it from the notation. 
Theorem 1.1 and Theorem 1.2 have analogs for matrix polynomials, and they are stated
next. The matrix norms are assumed to be subordinate (induced by a vector norm). 

Theorem 1.4. (Cauchy’s theorem - matrix version) All eigenvalues
of the matrix polynomial $P(z) = Iz^n + A_{n-1}z^{n-1} + \cdots + A_1 z + A_0$, where $n \geq 2$ and
$A_j \in \mathbb{C}^{m \times m}$ for $j = 0, \dots, n-1$, lie in $|z| \leq s$, where $s$ is the unique positive solution of

$$x^n - \|A_{n-1}\|x^{n-1} - \cdots - \|A_1\|x - \|A_0\| = 0 .$$

Theorem 1.5. (Pellet’s theorem - matrix version) Let

$$P(z) = Iz^n + A_{n-1}z^{n-1} + \cdots + A_1 z + A_0$$

be a matrix polynomial with $n \geq 2$, $A_j \in \mathbb{C}^{m \times m}$ for $j = 0, \dots, n-1$, and $A_0 \neq 0$. Let $A_k$
be invertible for some $k$ with $1 \leq k \leq n-1$, and let the polynomial

$$x^n + \|A_{n-1}\|x^{n-1} + \cdots + \|A_{k+1}\|x^{k+1} - \|A_k^{-1}\|^{-1}x^k + \|A_{k-1}\|x^{k-1} + \cdots + \|A_0\|$$

have two distinct positive roots $x_1$ and $x_2$ with $x_1 < x_2$. Then $\det(P)$ has exactly $km$ zeros
in or on the circle $|z| = x_1$ and no zeros in the annular ring $x_1 < |z| < x_2$.

Finally, we mention a theorem, which we call the Block Gershgorin theorem, due to 
D.G. Feingold and R.S. Varga ([!]). 

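Theorem 1.6. (Block Gershgorin theorem) (Theorems 2 and 4 of Feingold and Varga) Let $A$ be any
$n \times n$ matrix with complex entries, which is partitioned in the following manner:

$$A = \begin{pmatrix}
A_{1,1} & A_{1,2} & \cdots & A_{1,N} \\
A_{2,1} & A_{2,2} & \cdots & A_{2,N} \\
\vdots & \vdots & & \vdots \\
A_{N,1} & A_{N,2} & \cdots & A_{N,N}
\end{pmatrix} ,$$

where the diagonal submatrices $A_{i,i}$ are square of order $n_i$, $1 \leq i \leq N$, and let $I_i$ be the
$n_i \times n_i$ identity matrix. Then each eigenvalue $\lambda$ of $A$ lies in a Gershgorin set $G_j$ for at
least one $j$, $1 \leq j \leq N$, where $G_j$ is the set of all complex numbers $z$ such that

$$\left( \left\| (A_{j,j} - zI_j)^{-1} \right\| \right)^{-1} \leq \sum_{k=1, k \neq j}^{N} \|A_{j,k}\| .$$

If the union $H = \bigcup_{j=1}^{k} G_{p_j}$, $1 \leq p_j \leq N$, of $k$ Gershgorin sets $G_j$ is disjoint from the
remaining $N - k$ Gershgorin sets, then $H$ contains precisely $\sum_{j=1}^{k} n_{p_j}$ eigenvalues of $A$.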

The norms in this theorem are subordinate and the theorem has, just like Gershgorin’s
theorem, also a block-column form. The sets $G_j$ are difficult to compute in general, but
each is a union of $n_j$ discs when the Euclidean norm (2-norm) is used and the diagonal
blocks are normal matrices (see Theorem 5 in the cited reference). When the diagonal blocks are diagonal
matrices, the sets are unions of discs for any subordinate norm.

Lower bounds on the moduli of polynomial zeros can be obtained by applying the appropriate
aforementioned theorems to the reverse polynomial of $p$, defined by $p^{\#}(z) =
z^n p(1/z)/a_0$, whose zeros are the reciprocals of those of $p$.
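To make the reversal concrete, the sketch below forms the coefficients of the monic reverse polynomial and turns the Cauchy radius of Theorem 1.1 into a lower bound on the moduli of the zeros of $p$. The function names, the coefficient ordering `[a_0, ..., a_{n-1}]`, and the bisection solver are assumptions made for this example.

```python
import numpy as np

def cauchy_radius(a):
    """Unique positive root of x^n - |a_{n-1}| x^{n-1} - ... - |a_0| (0 if all a_j = 0)."""
    abs_a = np.abs(np.asarray(a, dtype=complex))
    if not abs_a.any():
        return 0.0
    poly = np.concatenate(([1.0], -abs_a[::-1]))   # highest degree first
    lo, hi = 0.0, 1.0 + abs_a.max()
    for _ in range(200):                           # plain bisection
        mid = 0.5 * (lo + hi)
        if np.polyval(poly, mid) >= 0.0:
            hi = mid
        else:
            lo = mid
    return hi

def reverse_coefficients(a):
    """Coefficients [c_0, ..., c_{n-1}] of the monic reverse polynomial z^n p(1/z) / a_0."""
    a = np.asarray(a, dtype=complex)
    if a[0] == 0:
        raise ValueError("a_0 must be nonzero")
    # z^n p(1/z)/a_0 = z^n + (a_1/a_0) z^{n-1} + ... + (a_{n-1}/a_0) z + 1/a_0
    return np.concatenate(([1.0], a[:0:-1])) / a[0]

def cauchy_lower_bound(a):
    """Lower bound on the moduli of the zeros of p, from the reverse polynomial."""
    return 1.0 / cauchy_radius(reverse_coefficients(a))
```

For $p(z) = z^2 - 3z + 2$ the reverse polynomial is $z^2 - \tfrac{3}{2}z + \tfrac{1}{2}$, with zeros $1$ and $\tfrac{1}{2}$, and `cauchy_lower_bound([2, -3])` returns a value below the smallest modulus of the zeros of $p$, which is 1.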

In the next section, we consider several similarity transformations of $C^2(p)$ that will be
used in Section 3 to derive Cauchy-like results as in Theorems 1.1 and 1.4, and in Section 4
to derive results similar to those of Theorems 1.2 and 1.5. We present numerical results
in Section 5 to illustrate the (sometimes drastic) improvements we were able to obtain.
An effort was made to make statements of theorems self-contained with the inevitable 
repetition of some definitions. 
2 Preliminaries


Throughout, we denote by $O(a; r)$ and $\bar{O}(a; r)$ the open and closed discs, respectively,
centered at $a$ with radius $r$. Consider the polynomial $p(z) = z^\ell + b_{\ell-1}z^{\ell-1} + \cdots + b_1 z + b_0$.
If it is not of even degree, then we consider $zp(z)$, which has a zero constant term. The
latter has no effect on theoretical results, nor does it affect, as we will see, numerical
results. To cover both cases at once, we define $m = \lceil \ell/2 \rceil$, $n = 2m$, and the complex
numbers $a_j$ as follows:

$$a_j = b_j \quad (\ell \text{ is even and } j = 0, \dots, n-1) ,$$
$$a_0 = 0 \ \text{ and } \ a_j = b_{j-1} \quad (\ell \text{ is odd and } j = 1, \dots, n-1) .$$

This means that, when $\ell$ is odd, we have $n = \ell + 1$ and $m = (\ell+1)/2$. When $\ell$ is even,
$n = \ell$ and $m = \ell/2$. From here on, we consider the polynomial
$p(z) = z^n + a_{n-1}z^{n-1} + \cdots + a_1 z + a_0$ of even degree $n$. From (2), the square of the companion matrix of $p$ can
be written as
be written as 


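$$C^2(p) = \begin{pmatrix}
0 & 0 & \cdots & 0 & -a_0 & a_{n-1}a_0 \\
0 & 0 & \cdots & 0 & -a_1 & a_{n-1}a_1 - a_0 \\
1 & 0 & \cdots & 0 & -a_2 & a_{n-1}a_2 - a_1 \\
\vdots & \ddots & \ddots & \vdots & \vdots & \vdots \\
0 & 0 & \cdots & 1 & -a_{n-1} & a_{n-1}^2 - a_{n-2}
\end{pmatrix}
= \begin{pmatrix} 0 & & & A_0 \\ I & \ddots & & A_1 \\ & \ddots & 0 & \vdots \\ & & I & A_{m-1} \end{pmatrix} ,$$

where $I$ is the $2 \times 2$ identity matrix,

$$A_0 = \begin{pmatrix} -a_0 & a_{n-1}a_0 \\ -a_1 & a_{n-1}a_1 - a_0 \end{pmatrix}
\quad \text{and} \quad
A_j = \begin{pmatrix} -a_{2j} & a_{n-1}a_{2j} - a_{2j-1} \\ -a_{2j+1} & a_{n-1}a_{2j+1} - a_{2j} \end{pmatrix}
\quad \text{for } j = 1, \dots, m-1 .$$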
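The padding to even degree and the $2 \times 2$ blocks $A_j$ can be sketched in Python as follows. The helper names `pad_to_even`, `companion` and `blocks_A` are assumptions introduced here, and coefficients are stored in ascending order, `[b_0, ..., b_{l-1}]`.

```python
import numpy as np

def pad_to_even(b):
    """b = [b_0, ..., b_{l-1}] -> a = [a_0, ..., a_{n-1}] with n even (coefficients of z p(z) if l is odd)."""
    b = np.asarray(b, dtype=complex)
    return b if b.size % 2 == 0 else np.concatenate(([0.0], b))

def companion(a):
    """Companion matrix of z^n + a_{n-1} z^{n-1} + ... + a_0."""
    n = a.size
    C = np.zeros((n, n), dtype=complex)
    C[1:, :-1] = np.eye(n - 1)
    C[:, -1] = -a
    return C

def blocks_A(a):
    """List [A_0, ..., A_{m-1}] of the 2x2 blocks in the last block column of C(p)^2."""
    m = a.size // 2
    an1 = a[-1]
    blocks = []
    for j in range(m):
        prev = a[2 * j - 1] if j > 0 else 0.0   # the term a_{2j-1} is absent for j = 0
        blocks.append(np.array([[-a[2 * j],     an1 * a[2 * j]     - prev],
                                [-a[2 * j + 1], an1 * a[2 * j + 1] - a[2 * j]]]))
    return blocks
```

Stacking the blocks with `np.vstack(blocks_A(a))` should reproduce the last two columns of `companion(a) @ companion(a)`, which is a convenient consistency check.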


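By Schur’s theorem, there exists a unitary matrix $U$ that triangularizes the matrix

$$A_{m-1} = \begin{pmatrix} -a_{n-2} & a_{n-1}a_{n-2} - a_{n-3} \\ -a_{n-1} & a_{n-1}^2 - a_{n-2} \end{pmatrix} ,$$

i.e.,

$$U^* A_{m-1} U = \begin{pmatrix} \alpha & \gamma \\ 0 & \beta \end{pmatrix} ,$$

where $\alpha, \beta, \gamma \in \mathbb{C}$, $\alpha$ and $\beta$ are the eigenvalues of $A_{m-1}$, and $U^* = U^{-1}$ is the Hermitian
conjugate of $U$. If $A_{m-1}$ is diagonalizable, then there exists a nonsingular matrix $M$ for
which

$$M^{-1} A_{m-1} M = \begin{pmatrix} \alpha & 0 \\ 0 & \beta \end{pmatrix} .$$

In what follows, the matrix $S$ is defined as a matrix that either triangularizes $A_{m-1}$ or
diagonalizes it if that is possible.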

We now consider three similarity transformations of $C^2(p)$ together with their Gershgorin
column sets, that will form the building blocks for the Cauchy-like and Pellet-like
results of the following sections.








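Let $Q$ be an $n \times n$ block diagonal matrix with $m = n/2$ identical diagonal blocks equal
to $S$. Then we define $C_S^2(p) = Q^{-1}C^2(p)Q$, and, with (2), obtain

$$C_S^2(p) = \begin{pmatrix} 0 & & & S^{-1}A_0 S \\ I & \ddots & & S^{-1}A_1 S \\ & \ddots & 0 & \vdots \\ & & I & S^{-1}A_{m-1}S \end{pmatrix}
= \begin{pmatrix}
0 & 0 & \cdots & 0 & 0 & v_0 & w_0 \\
0 & 0 & \cdots & 0 & 0 & v_1 & w_1 \\
1 & 0 & \cdots & 0 & 0 & v_2 & w_2 \\
0 & 1 & \cdots & 0 & 0 & v_3 & w_3 \\
\vdots & & \ddots & & \vdots & \vdots & \vdots \\
0 & 0 & \cdots & 1 & 0 & \alpha & \gamma \\
0 & 0 & \cdots & 0 & 1 & 0 & \beta
\end{pmatrix} ,$$

where the vectors $v, w \in \mathbb{C}^{n-2}$ are defined by

$$\begin{pmatrix} v_{2j} & w_{2j} \\ v_{2j+1} & w_{2j+1} \end{pmatrix} = S^{-1}A_j S \quad \text{for } j = 0, \dots, m-2 ,$$

and $\alpha, \beta, \gamma \in \mathbb{C}$ with $\gamma = 0$ if $A_{m-1}$ is diagonalizable. The triangularization (or
diagonalization) of the lower right-hand block of $C^2(p)$ will facilitate the application of Gershgorin’s
theorem.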

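To $C_S^2(p)$ we apply two different similarity transformations. First, let $D_x$ be the diagonal
matrix with diagonal $(x^n, x^{n-1}, \dots, x)$ and define

$$F_x(p) = D_x^{-1} C_S^2(p) D_x = \begin{pmatrix}
0 & 0 & \cdots & 0 & 0 & v_0/x^{n-2} & w_0/x^{n-1} \\
0 & 0 & \cdots & 0 & 0 & v_1/x^{n-3} & w_1/x^{n-2} \\
x^2 & 0 & \cdots & 0 & 0 & v_2/x^{n-4} & w_2/x^{n-3} \\
0 & x^2 & \cdots & 0 & 0 & v_3/x^{n-5} & w_3/x^{n-4} \\
\vdots & & \ddots & & \vdots & \vdots & \vdots \\
0 & 0 & \cdots & x^2 & 0 & \alpha & \gamma/x \\
0 & 0 & \cdots & 0 & x^2 & 0 & \beta
\end{pmatrix} . \qquad (3)$$

Then for any $x > 0$, the Gershgorin column set of $F_x(p)$ is the union of the three discs
$O_1(x) = \bar{O}(0; x^2)$, $O_2(x) = \bar{O}(\alpha; \rho_1(x))$, and $O_3(x) = \bar{O}(\beta; \rho_2(x))$, where

$$\rho_1(x) = \frac{|v_{n-3}|}{x} + \frac{|v_{n-4}|}{x^2} + \cdots + \frac{|v_1|}{x^{n-3}} + \frac{|v_0|}{x^{n-2}} , \qquad (4)$$

and

$$\rho_2(x) = \frac{|\gamma|}{x} + \frac{|w_{n-3}|}{x^2} + \frac{|w_{n-4}|}{x^3} + \cdots + \frac{|w_1|}{x^{n-2}} + \frac{|w_0|}{x^{n-1}} . \qquad (5)$$

As $x$ varies, the discs expand and contract, and we will use this flexibility later to obtain
convenient configurations of the discs.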
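The three discs of this first transformation can be computed directly. The sketch below uses a unitary (Schur) triangularization of the $2 \times 2$ lower right-hand block for the matrix $S$, built by hand from one eigenvector; this choice, the helper names `companion`, `schur_2x2`, `cs2_matrix` and `fx_discs`, and the coefficient ordering `[a_0, ..., a_{n-1}]` with $n$ even are all assumptions of the example.

```python
import numpy as np

def companion(a):
    """Companion matrix of z^n + a_{n-1} z^{n-1} + ... + a_0."""
    n = a.size
    C = np.zeros((n, n), dtype=complex)
    C[1:, :-1] = np.eye(n - 1)
    C[:, -1] = -a
    return C

def schur_2x2(A):
    """Unitary U such that U* A U is upper triangular, for a 2x2 matrix A."""
    alpha = np.linalg.eigvals(A)[0]
    # Two candidate eigenvectors for alpha; keep the one with the larger norm.
    u1 = np.array([A[0, 1], alpha - A[0, 0]])
    u2 = np.array([alpha - A[1, 1], A[1, 0]])
    u = u1 if np.linalg.norm(u1) >= np.linalg.norm(u2) else u2
    if np.linalg.norm(u) < 1e-14:          # A is a multiple of the identity
        return np.eye(2, dtype=complex)
    u = u / np.linalg.norm(u)
    return np.array([[u[0], -np.conj(u[1])], [u[1], np.conj(u[0])]])

def cs2_matrix(a):
    """Q^{-1} C(p)^2 Q with Q block diagonal, all blocks equal to the Schur factor."""
    a = np.asarray(a, dtype=complex)
    C = companion(a)
    C2 = C @ C
    Q = np.kron(np.eye(a.size // 2), schur_2x2(C2[-2:, -2:]))
    return Q.conj().T @ C2 @ Q

def fx_discs(a, x):
    """(center, radius) of the three Gershgorin column discs of F_x(p)."""
    CS2 = cs2_matrix(a)
    n = CS2.shape[0]
    v, w = np.abs(CS2[:n - 2, n - 2]), np.abs(CS2[:n - 2, n - 1])
    alpha, beta, gamma = CS2[n - 2, n - 2], CS2[n - 1, n - 1], CS2[n - 2, n - 1]
    i = np.arange(n - 2)
    rho1 = np.sum(v / x ** (n - 2 - i))                       # formula (4)
    rho2 = abs(gamma) / x + np.sum(w / x ** (n - 1 - i))      # formula (5)
    return [(0.0, x ** 2), (alpha, rho1), (beta, rho2)]
```

Calling `fx_discs(a, x)` for several values of `x` shows the behavior described above: the disc at the origin grows like $x^2$ while the other two radii decrease.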

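Secondly, let $\Delta_x$ be the block diagonal matrix with diagonal blocks $(x^m I, \dots, xI)$,
where $I$ is the $2 \times 2$ identity matrix and $x > 0$. We define

$$\Phi_x(p) = \Delta_x^{-1} C_S^2(p) \Delta_x = \begin{pmatrix}
0 & 0 & \cdots & 0 & 0 & v_0/x^{m-1} & w_0/x^{m-1} \\
0 & 0 & \cdots & 0 & 0 & v_1/x^{m-1} & w_1/x^{m-1} \\
x & 0 & \cdots & 0 & 0 & v_2/x^{m-2} & w_2/x^{m-2} \\
0 & x & \cdots & 0 & 0 & v_3/x^{m-2} & w_3/x^{m-2} \\
\vdots & & \ddots & & \vdots & \vdots & \vdots \\
0 & 0 & \cdots & x & 0 & \alpha & \gamma \\
0 & 0 & \cdots & 0 & x & 0 & \beta
\end{pmatrix} . \qquad (6)$$

The Gershgorin column set of $\Phi_x(p)$ is the union of the three discs $O_1'(x) = \bar{O}(0; x)$,
$O_2'(x) = \bar{O}(\alpha; \sigma_1(x))$, and $O_3'(x) = \bar{O}(\beta; \sigma_2(x))$, where

$$\sigma_1(x) = \frac{|v_{n-3}| + |v_{n-4}|}{x} + \frac{|v_{n-5}| + |v_{n-6}|}{x^2} + \cdots + \frac{|v_1| + |v_0|}{x^{m-1}} , \qquad (7)$$

and

$$\sigma_2(x) = |\gamma| + \frac{|w_{n-3}| + |w_{n-4}|}{x} + \frac{|w_{n-5}| + |w_{n-6}|}{x^2} + \cdots + \frac{|w_1| + |w_0|}{x^{m-1}} . \qquad (8)$$

We define

$$T_{m-1} = \begin{pmatrix} \alpha & \gamma \\ 0 & \beta \end{pmatrix} , \qquad T_j = S^{-1}A_j S \quad \text{for } j = 0, \dots, m-2 ,$$

and, for any subordinate matrix norm,

$$\tau(x) = \frac{\|T_{m-2}\|}{x} + \frac{\|T_{m-3}\|}{x^2} + \cdots + \frac{\|T_0\|}{x^{m-1}} .$$

If $A_{m-1}$ is diagonalizable, we choose the matrix $S$ such that $\gamma = 0$ and the Block Gershgorin
column set of $\Phi_x(p)$ is then given by $G_1 \cup G_2$, where the sets $G_j$ are as defined in
the statement of Theorem 1.6. It is straightforward to show that $G_1 = \bar{O}(0; x)$ and that

$$G_2 = \left\{ z : \left( \max \left\{ \frac{1}{|z - \alpha|}, \frac{1}{|z - \beta|} \right\} \right)^{-1} \leq \tau(x) \right\} .$$

This means that $G_2 = \bar{O}(\alpha; \tau(x)) \cup \bar{O}(\beta; \tau(x))$, the union of two discs centered at $\alpha$ and
$\beta$, respectively, with identical radii $\tau(x)$. For this theorem, it is important for $A_{m-1}$ to be
diagonalizable since the inclusion region would otherwise become too complicated to be
useful.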

Remarks. 

• One observes that $C^2(p)$ is the block companion matrix of the matrix polynomial
$P(z) = Iz^m - A_{m-1}z^{m-1} - \cdots - A_1 z - A_0$, i.e., the squares of the zeros of $p$ are also
the eigenvalues of $P$. It follows that they are also the eigenvalues of

$$P_S(z) = Iz^m - T_{m-1}z^{m-1} - \cdots - T_1 z - T_0 . \qquad (9)$$
• Since it simplifies all the equations we will encounter, we will diagonalize $A_{m-1}$
whenever this is possible. It is a straightforward exercise to determine when $A_{m-1}$
is not diagonalizable in terms of the coefficients of $p$, since it can be written as

$$A_{m-1} = \begin{pmatrix} -a_{n-2} & a_{n-1}a_{n-2} - a_{n-3} \\ -a_{n-1} & a_{n-1}^2 - a_{n-2} \end{pmatrix}
= -a_{n-2} I + \begin{pmatrix} 0 & a_{n-1}a_{n-2} - a_{n-3} \\ -a_{n-1} & a_{n-1}^2 \end{pmatrix} .$$

Excluding $a_{n-1} = a_{n-3} = 0$ (which makes $A_{m-1}$ diagonal), $A_{m-1}$ cannot be
diagonalized if it has a double eigenvalue. Since the characteristic polynomial of
$A_{m-1} + a_{n-2}I$ is given by

$$\lambda^2 - a_{n-1}^2 \lambda + a_{n-1}(a_{n-1}a_{n-2} - a_{n-3}) ,$$

the matrix $A_{m-1}$ is not diagonalizable when

$$a_{n-1}\left( a_{n-1}^3 - 4a_{n-1}a_{n-2} + 4a_{n-3} \right) = 0 ,$$

unless $a_{n-1} = a_{n-3} = 0$.


• As always when deriving bounds, an eye should be kept on the computational cost
this entails, as this cost should remain well below the cost to compute the zeros
exactly. This is certainly the case here, as the similarity transformations and the
matrix norms, involving at most $2 \times 2$ blocks, require a total of $O(n)$ arithmetic operations,
as do the solutions of the various real polynomial equations. The latter should
require few iterations when a properly adapted iterative method is used.
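The diagonalizability criterion in the second remark translates into a short test on the three leading coefficients. The following sketch is one way to write it; the function names `last_block` and `last_block_is_diagonalizable`, the tolerance `tol`, and the assumption that `a = [a_0, ..., a_{n-1}]` has even length $n \geq 4$ are choices made for this example.

```python
import numpy as np

def last_block(a):
    """The 2x2 block A_{m-1} built from a_{n-1}, a_{n-2}, a_{n-3}."""
    an1, an2, an3 = a[-1], a[-2], a[-3]
    return np.array([[-an2, an1 * an2 - an3],
                     [-an1, an1 ** 2 - an2]], dtype=complex)

def last_block_is_diagonalizable(a, tol=1e-12):
    """True unless A_{m-1} has a double eigenvalue without being diagonal."""
    an1, an2, an3 = complex(a[-1]), complex(a[-2]), complex(a[-3])
    if abs(an1) <= tol and abs(an3) <= tol:
        return True                       # A_{m-1} is already diagonal
    # Discriminant of lambda^2 - a_{n-1}^2 lambda + a_{n-1}(a_{n-1} a_{n-2} - a_{n-3})
    disc = an1 * (an1 ** 3 - 4 * an1 * an2 + 4 * an3)
    return abs(disc) > tol
```

With a floating-point tolerance this is only a heuristic near the boundary case; when the test fails, a triangularizing $S$ can be used instead.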


3 Cauchy-like results 


We now present a theorem containing two Cauchy-like results for the moduli of a polynomial’s
zeros, using similarity transformations of the squared companion matrix.

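Theorem 3.1. For a polynomial $p(z) = z^\ell + b_{\ell-1}z^{\ell-1} + \cdots + b_1 z + b_0$ with complex
coefficients, $\ell \geq 3$, and zeros $\{z_j\}_{j=1}^{\ell}$, define $m = \lceil \ell/2 \rceil$, $n = 2m$, and the complex numbers
$a_j$ as follows:

$$a_j = b_j \quad (\ell \text{ is even and } j = 0, \dots, n-1) ,$$
$$a_0 = 0 \ \text{ and } \ a_j = b_{j-1} \quad (\ell \text{ is odd and } j = 1, \dots, n-1) .$$

Furthermore, define

$$A_0 = \begin{pmatrix} -a_0 & a_{n-1}a_0 \\ -a_1 & a_{n-1}a_1 - a_0 \end{pmatrix}
\quad \text{and} \quad
A_j = \begin{pmatrix} -a_{2j} & a_{n-1}a_{2j} - a_{2j-1} \\ -a_{2j+1} & a_{n-1}a_{2j+1} - a_{2j} \end{pmatrix}
\quad \text{for } j = 1, \dots, m-1 ,$$

and let $S$ be a nonsingular matrix such that
$S^{-1}A_{m-1}S = \begin{pmatrix} \alpha & \gamma \\ 0 & \beta \end{pmatrix}$, where $\alpha, \beta, \gamma \in \mathbb{C}$.
Define the vectors $v, w \in \mathbb{C}^{n-2}$ as

$$\begin{pmatrix} v_{2j} & w_{2j} \\ v_{2j+1} & w_{2j+1} \end{pmatrix} = S^{-1}A_j S \quad \text{for } j = 0, \dots, m-2 .$$

Then the following holds.

(a) $|z_j| \leq \max\{r_1, r_2\}$, where $r_1$ and $r_2$ are the unique positive solutions of $\psi_1(x) = 0$ and
$\psi_2(x) = 0$, respectively, given by

$$\psi_1(x) = x^n - |\alpha|x^{n-2} - |v_{n-3}|x^{n-3} - \cdots - |v_1|x - |v_0| ,$$

and

$$\psi_2(x) = x^{n+1} - |\beta|x^{n-1} - |\gamma|x^{n-2} - |w_{n-3}|x^{n-3} - \cdots - |w_1|x - |w_0| .$$

(b) $|z_j| \leq \left( \max\{s_1, s_2\} \right)^{1/2}$, where $s_1$ and $s_2$ are the unique positive solutions of $\varphi_1(x) = 0$
and $\varphi_2(x) = 0$, respectively, given by

$$\varphi_1(x) = x^m - |\alpha|x^{m-1} - (|v_{n-3}| + |v_{n-4}|)x^{m-2} - \cdots - (|v_3| + |v_2|)x - (|v_1| + |v_0|) ,$$

and

$$\varphi_2(x) = x^m - (|\beta| + |\gamma|)x^{m-1} - (|w_{n-3}| + |w_{n-4}|)x^{m-2} - \cdots - (|w_3| + |w_2|)x - (|w_1| + |w_0|) .$$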

Proof.

(a) First assume that $\ell$ is even, in which case $m = \ell/2$, $n = \ell$, and $a_j = b_j$. Define $C_S^2(p)$
as in Section 2, and for $x > 0$ let $D_x$ be the diagonal matrix with diagonal $(x^n, x^{n-1}, \dots, x)$.
Define $F_x(p) = D_x^{-1}C_S^2(p)D_x$, so that, with (3), the Gershgorin column set of $F_x(p)$ is the
union of the three discs $O_1(x) = \bar{O}(0; x^2)$, $O_2(x) = \bar{O}(\alpha; \rho_1(x))$, and $O_3(x) = \bar{O}(\beta; \rho_2(x))$,
where $\rho_1$ and $\rho_2$ are defined by (4) and (5), respectively. As $x$ increases, $O_1(x)$ expands,
while $O_2(x)$ and $O_3(x)$ contract. When $x^2 = |\alpha| + \rho_1(x)$, $O_1(x)$ and $O_2(x)$ will be tangent
to one another and $O_2(x) \subseteq O_1(x)$. This happens when $x = r_1$, where $r_1$ is the unique
positive solution of

$$x^n - |\alpha|x^{n-2} - |v_{n-3}|x^{n-3} - \cdots - |v_1|x - |v_0| = 0 .$$

If $O_3(r_1) \subseteq O_1(r_1)$, then the Gershgorin set is $O_1(r_1)$. If this is not the case, we let $x$
increase further, until $O_1(x)$ becomes tangent to $O_3(x)$ and $O_3(x) \subseteq O_1(x)$. This occurs
when $x^2 = |\beta| + \rho_2(x)$, which is when $x = r_2$, where $r_2$ is the unique positive solution of

$$x^{n+1} - |\beta|x^{n-1} - |\gamma|x^{n-2} - |w_{n-3}|x^{n-3} - \cdots - |w_1|x - |w_0| = 0 .$$

The Gershgorin set of $F_x(p)$ is then equal to $O_1(r_2)$. The case where $O_1(x)$ first becomes
tangent to $O_3(x)$ is analogous. We conclude that the Gershgorin set of $F_x(p)$ is given by
$\bar{O}\left(0; (\max\{r_1, r_2\})^2\right)$. This means that, for any of the zeros $z_j$ of $p$, $|z_j|^2 \leq (\max\{r_1, r_2\})^2$,
which concludes the proof of part (a) when $\ell$ is even. When $\ell$ is odd, we multiply $p$ by
$z$, which makes it a polynomial of even degree with an added zero at the origin and with
$m = \lceil \ell/2 \rceil$. The coefficients $a_j$ of $zp(z)$ are then as defined in the statement of the theorem,
with $a_0 = 0$. The latter is of no consequence and the proof of part (a) then follows from
the even case.

(b) Here too, we first assume that $\ell$ is even. We then proceed similarly as in part (a) and let
$\Delta_x$ be the block diagonal matrix with diagonal blocks $(x^m I, \dots, xI)$, where $I$ is the
$2 \times 2$ identity matrix. We define $\Phi_x(p) = \Delta_x^{-1}C_S^2(p)\Delta_x$, so that, with (6), the Gershgorin
column set of $\Phi_x(p)$ is the union of the three discs $O_1'(x) = \bar{O}(0; x)$, $O_2'(x) = \bar{O}(\alpha; \sigma_1(x))$,
and $O_3'(x) = \bar{O}(\beta; \sigma_2(x))$, where $\sigma_1$ and $\sigma_2$ are defined by (7) and (8), respectively. As $x$
increases, $O_1'(x)$ expands, while $O_2'(x)$ and $O_3'(x)$ contract. When $x = |\alpha| + \sigma_1(x)$, $O_1'(x)$
and $O_2'(x)$ are tangent to one another and $O_2'(x) \subseteq O_1'(x)$. This occurs when $x = s_1$,
where $s_1$ is the unique positive solution of

$$x^m - |\alpha|x^{m-1} - (|v_{n-3}| + |v_{n-4}|)x^{m-2} - \cdots - (|v_3| + |v_2|)x - (|v_1| + |v_0|) = 0 .$$

On the other hand, $O_1'(x)$ and $O_3'(x)$ are tangent to one another and $O_3'(x) \subseteq O_1'(x)$ when
$x = |\beta| + \sigma_2(x)$, which happens when $x = s_2$, where $s_2$ is the unique positive solution of

$$x^m - (|\beta| + |\gamma|)x^{m-1} - (|w_{n-3}| + |w_{n-4}|)x^{m-2} - \cdots - (|w_3| + |w_2|)x - (|w_1| + |w_0|) = 0 .$$

From here on, the proof proceeds analogously to the proof of part (a) and we conclude
that the Gershgorin set of $\Phi_x(p)$ is given by $\bar{O}\left(0; \max\{s_1, s_2\}\right)$. Consequently, we obtain
for the zeros $z_j$ of $p$ that $|z_j|^2 \leq \max\{s_1, s_2\}$, which concludes the proof of part (b) for
even $\ell$. When $\ell$ is odd, we consider $zp(z)$ instead of $p(z)$ and the proof follows from the
even case, analogously as in part (a). □

One of the advantages of the triangularization of $A_{m-1}$ when applying Gershgorin’s
theorem is made clear by the proof of part (a), where $O_2(x)$ would otherwise not necessarily
contract with increasing $x$ if the (2,1)-element of $S^{-1}A_{m-1}S$ were nonzero, as it would be
multiplied by $x$ after the similarity transformation.

Although the solution of the real equations in this theorem requires only a fraction
of the computational effort required to compute all the zeros of $p$, it is worth mentioning
that this can be carried out very efficiently. Once, e.g., $r_1$ is computed, the sign of $\psi_2(r_1)$
immediately determines if it is larger than $r_2$ or not. If it is, $\psi_2(x) = 0$ need not be solved.
An analogous situation exists for part (b). Moreover, it is computationally less expensive
to solve two equations of degree $n/2$ than just one of degree $n$.
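The two bounds of Theorem 3.1, including the sign test that can avoid solving the second equation, can be sketched as follows. The matrix $S$ is taken to be a unitary triangularizing matrix built from one eigenvector, and the real equations are solved by bisection; both choices, together with all function names and the input convention `b = [b_0, ..., b_{l-1}]`, are assumptions of this example and not the only possible implementation.

```python
import numpy as np

def companion(a):
    n = a.size
    C = np.zeros((n, n), dtype=complex)
    C[1:, :-1] = np.eye(n - 1)
    C[:, -1] = -a
    return C

def schur_2x2(A):
    """Unitary U such that U* A U is upper triangular (2x2 case)."""
    alpha = np.linalg.eigvals(A)[0]
    u1 = np.array([A[0, 1], alpha - A[0, 0]])
    u2 = np.array([alpha - A[1, 1], A[1, 0]])
    u = u1 if np.linalg.norm(u1) >= np.linalg.norm(u2) else u2
    if np.linalg.norm(u) < 1e-14:          # A is a multiple of the identity
        return np.eye(2, dtype=complex)
    u = u / np.linalg.norm(u)
    return np.array([[u[0], -np.conj(u[1])], [u[1], np.conj(u[0])]])

def positive_root(poly):
    """Unique positive root of x^d - c_{d-1} x^{d-1} - ... - c_0 with all c_j >= 0."""
    c = np.abs(poly[1:])
    if not c.any():
        return 0.0
    lo, hi = 0.0, 1.0 + c.max()
    for _ in range(200):                   # bisection; poly(hi) > 0 by construction
        mid = 0.5 * (lo + hi)
        if np.polyval(poly, mid) >= 0.0:
            hi = mid
        else:
            lo = mid
    return hi

def larger_root(p1, p2):
    """max of the positive roots of p1 and p2, skipping p2 when its sign at r1 allows it."""
    r1 = positive_root(p1)
    return r1 if np.polyval(p2, r1) >= 0.0 else positive_root(p2)

def cauchy_like_bounds(b):
    """Bounds (a) and (b) on the moduli of the zeros of z^l + b_{l-1} z^{l-1} + ... + b_0."""
    b = np.asarray(b, dtype=complex)
    a = b if b.size % 2 == 0 else np.concatenate(([0.0], b))   # pad to even degree
    n, m = a.size, a.size // 2
    C = companion(a)
    C2 = C @ C
    Q = np.kron(np.eye(m), schur_2x2(C2[-2:, -2:]))
    CS2 = Q.conj().T @ C2 @ Q
    v, w = np.abs(CS2[:n - 2, n - 2]), np.abs(CS2[:n - 2, n - 1])
    al, be, ga = abs(CS2[n - 2, n - 2]), abs(CS2[n - 1, n - 1]), abs(CS2[n - 2, n - 1])
    # Part (a): psi_1 and psi_2, highest degree first.
    psi1 = np.concatenate(([1.0, 0.0, -al], -v[::-1]))
    psi2 = np.concatenate(([1.0, 0.0, -be, -ga], -w[::-1]))
    # Part (b): phi_1 and phi_2, with consecutive entries of v and w paired up.
    phi1 = np.concatenate(([1.0, -al], -(v[0::2] + v[1::2])[::-1]))
    phi2 = np.concatenate(([1.0, -(be + ga)], -(w[0::2] + w[1::2])[::-1]))
    return larger_root(psi1, psi2), np.sqrt(larger_root(phi1, phi2))
```

A call such as `cauchy_like_bounds([2, -1, 3, 1, 0.5, 2])` returns the pair of bounds from parts (a) and (b); the smaller of the two can then be used as the bound on the moduli of the zeros.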

Although it would lead us too far, more polynomial inclusion regions along the lines
of the ones obtained in [10] or in Corollary 8.2.3 of the cited reference can be derived here as well. Such
regions are obtained when one of the discs centered at $\alpha$ or $\beta$ is allowed to absorb the one
centered at the origin. In addition, other special values for the parameter $x$ might also be
considered, such as, e.g., values for which two discs have the same radius.

4 Pellet-like results 

It is sometimes possible to isolate one zero of a polynomial by applying Gershgorin’s
theorem to $C(p)$ (see, e.g., [10]), and a similar approach can be applied here, although
here it can lead to the isolation of one or two squares of zeros of the polynomial (and
therefore to the isolation of the zeros themselves). This is reminiscent of special cases
of Pellet’s theorem, whence the title of this section, although it does provide a smaller
inclusion region than those provided by Pellet’s theorem, which lead to discs, annuli, and
(infinite) complements of discs. Very often, however, it is the very ability to isolate zeros
that is important, rather than the size of the inclusion region. We formulate two theorems,
based on the similarity transformations of $C^2(p)$ introduced in Section 2. To make them
self-contained, their statements include some previously defined quantities.

Theorem 4.1. For a polynomial $p(z) = z^\ell + b_{\ell-1}z^{\ell-1} + \cdots + b_1 z + b_0$ with complex
coefficients, $\ell \geq 3$, define $m = \lceil \ell/2 \rceil$, $n = 2m$, and the complex numbers $a_j$ as follows:

$$a_j = b_j \quad (\ell \text{ is even and } j = 0, \dots, n-1) ,$$
$$a_0 = 0 \ \text{ and } \ a_j = b_{j-1} \quad (\ell \text{ is odd and } j = 1, \dots, n-1) .$$

Furthermore, define

$$A_0 = \begin{pmatrix} -a_0 & a_{n-1}a_0 \\ -a_1 & a_{n-1}a_1 - a_0 \end{pmatrix}
\quad \text{and} \quad
A_j = \begin{pmatrix} -a_{2j} & a_{n-1}a_{2j} - a_{2j-1} \\ -a_{2j+1} & a_{n-1}a_{2j+1} - a_{2j} \end{pmatrix}
\quad \text{for } j = 1, \dots, m-1 ,$$

and let $S$ be a nonsingular matrix such that
$S^{-1}A_{m-1}S = \begin{pmatrix} \alpha & \gamma \\ 0 & \beta \end{pmatrix}$, where $\alpha, \beta, \gamma \in \mathbb{C}$.
Define the vectors $v, w \in \mathbb{C}^{n-2}$ as

$$\begin{pmatrix} v_{2j} & w_{2j} \\ v_{2j+1} & w_{2j+1} \end{pmatrix} = S^{-1}A_j S \quad \text{for } j = 0, \dots, m-2 .$$

For $x > 0$, let

$$\rho_1(x) = \frac{|v_{n-3}|}{x} + \frac{|v_{n-4}|}{x^2} + \cdots + \frac{|v_1|}{x^{n-3}} + \frac{|v_0|}{x^{n-2}} ,$$
$$\rho_2(x) = \frac{|\gamma|}{x} + \frac{|w_{n-3}|}{x^2} + \frac{|w_{n-4}|}{x^3} + \cdots + \frac{|w_1|}{x^{n-2}} + \frac{|w_0|}{x^{n-1}} ,$$
$$\chi_1(x) = x^n - |\alpha|x^{n-2} + |v_{n-3}|x^{n-3} + \cdots + |v_1|x + |v_0| ,$$
$$\chi_2(x) = x^{n+1} - |\beta|x^{n-1} + |\gamma|x^{n-2} + |w_{n-3}|x^{n-3} + \cdots + |w_1|x + |w_0| ,$$

and

$$\sigma_1(x) = \frac{|v_{n-3}| + |v_{n-4}|}{x} + \frac{|v_{n-5}| + |v_{n-6}|}{x^2} + \cdots + \frac{|v_1| + |v_0|}{x^{m-1}} ,$$
$$\sigma_2(x) = |\gamma| + \frac{|w_{n-3}| + |w_{n-4}|}{x} + \frac{|w_{n-5}| + |w_{n-6}|}{x^2} + \cdots + \frac{|w_1| + |w_0|}{x^{m-1}} ,$$
$$\omega_1(x) = x^m - |\alpha|x^{m-1} + (|v_{n-3}| + |v_{n-4}|)x^{m-2} + \cdots + (|v_3| + |v_2|)x + (|v_1| + |v_0|) ,$$
$$\omega_2(x) = x^m - (|\beta| - |\gamma|)x^{m-1} + (|w_{n-3}| + |w_{n-4}|)x^{m-2} + \cdots + (|w_3| + |w_2|)x + (|w_1| + |w_0|) .$$

Define for $x > 0$:

$$O_1(x) = \bar{O}(0; x^2) , \quad O_2(x) = \bar{O}(\alpha; \rho_1(x)) , \quad O_3(x) = \bar{O}(\beta; \rho_2(x)) ,$$
$$O_1'(x) = \bar{O}(0; x) , \quad O_2'(x) = \bar{O}(\alpha; \sigma_1(x)) , \quad O_3'(x) = \bar{O}(\beta; \sigma_2(x)) .$$

If $\chi_1(x) = 0$ has positive solutions $t_1$ and $t_2$, set $I_1 = [t_1, t_2]$, otherwise $I_1 = \emptyset$. If $\chi_2(x) = 0$
has positive solutions $u_1$ and $u_2$, set $I_2 = [u_1, u_2]$, otherwise $I_2 = \emptyset$. If $\omega_1(x) = 0$ has positive
solutions $\hat{t}_1$ and $\hat{t}_2$, set $J_1 = [\hat{t}_1, \hat{t}_2]$, otherwise $J_1 = \emptyset$. If $\omega_2(x) = 0$ has positive
solutions $\hat{u}_1$ and $\hat{u}_2$, set $J_2 = [\hat{u}_1, \hat{u}_2]$, otherwise $J_2 = \emptyset$. Then the following holds.

(a1) If $I_1 \cap I_2 \neq \emptyset$, set $\mu_1 = \max\{t_1, u_1\}$ and $\mu_2 = \min\{t_2, u_2\}$. Then $\ell - 2$ of $p$’s zeros lie in
$\bar{O}(0; \mu_1)$, while the union $O_2(\mu_2) \cup O_3(\mu_2)$ contains the squares of the two remaining
zeros, i.e., $p$ has no zeros with a modulus between $\mu_1$ and $\left( \min\{|\alpha| - \rho_1(\mu_2), |\beta| - \rho_2(\mu_2)\} \right)^{1/2}$.
If, in addition, $|\alpha - \beta| > \rho_1(\mu_2) + \rho_2(\mu_2)$, then $O_2(\mu_2)$ and $O_3(\mu_2)$ each contain one square
of a zero of $p$.

(a2) If $I_1 \cap I_2 = \emptyset$, we have the following.

• If $I_1 \neq \emptyset$ and $|\alpha - \beta| > \rho_1(t_2) + \rho_2(t_2)$, then the squares of $\ell - 1$ of $p$’s zeros are
contained in the union $O_1(t_2) \cup O_3(t_2)$, while the remaining square of a zero lies in
$O_2(t_2)$.

• If $I_2 \neq \emptyset$ and $|\alpha - \beta| > \rho_1(u_2) + \rho_2(u_2)$, then the squares of $\ell - 1$ of $p$’s zeros are
contained in the union $O_1(u_2) \cup O_2(u_2)$, while the remaining square of a zero lies in
$O_3(u_2)$.

(b1) If $J_1 \cap J_2 \neq \emptyset$, set $\hat{\mu}_1 = \max\{\hat{t}_1, \hat{u}_1\}$ and $\hat{\mu}_2 = \min\{\hat{t}_2, \hat{u}_2\}$. Then $\ell - 2$ of $p$’s zeros lie
in $\bar{O}(0; \sqrt{\hat{\mu}_1})$, while the union $O_2'(\hat{\mu}_2) \cup O_3'(\hat{\mu}_2)$ contains the squares of the two remaining
zeros, i.e., $p$ has no zeros with a modulus between $\sqrt{\hat{\mu}_1}$ and $\left( \min\{|\alpha| - \sigma_1(\hat{\mu}_2), |\beta| - \sigma_2(\hat{\mu}_2)\} \right)^{1/2}$.
If, in addition, $|\alpha - \beta| > \sigma_1(\hat{\mu}_2) + \sigma_2(\hat{\mu}_2)$, then $O_2'(\hat{\mu}_2)$ and $O_3'(\hat{\mu}_2)$ each contain one square
of a zero of $p$.

(b2) If $J_1 \cap J_2 = \emptyset$, we have the following.

• If $J_1 \neq \emptyset$ and $|\alpha - \beta| > \sigma_1(\hat{t}_2) + \sigma_2(\hat{t}_2)$, then the squares of $\ell - 1$ of $p$’s zeros are
contained in the union $O_1'(\hat{t}_2) \cup O_3'(\hat{t}_2)$, while the remaining square of a zero lies in
$O_2'(\hat{t}_2)$.

• If $J_2 \neq \emptyset$ and $|\alpha - \beta| > \sigma_1(\hat{u}_2) + \sigma_2(\hat{u}_2)$, then the squares of $\ell - 1$ of $p$’s zeros are
contained in the union $O_1'(\hat{u}_2) \cup O_2'(\hat{u}_2)$, while the remaining square of a zero lies in
$O_3'(\hat{u}_2)$.

Proof.

(a1) When $\ell$ is even, $m = \ell/2$, $n = \ell$, and $a_j = b_j$. In the Gershgorin column set for
$F_x(p)$, defined by (3), $O_1(x)$ is disjoint from the other two discs if

$$|\alpha| > x^2 + \rho_1(x) \iff \chi_1(x) < 0 \quad \text{and} \quad |\beta| > x^2 + \rho_2(x) \iff \chi_2(x) < 0 .$$

This can only happen if $I_1 \cap I_2$ is not empty, in which case $I_1 \cap I_2 = [\mu_1, \mu_2]$, where $\mu_1$
and $\mu_2$ are defined in the statement of the theorem. By Gershgorin’s theorem, $O_1(x)$
then contains $n - 2$ squares of zeros of $p$, while $O_2(x) \cup O_3(x)$ contains the remaining
two for any $x$ satisfying $\mu_1 < x < \mu_2$. This is therefore true for the intersection of all
these Gershgorin sets as $x$ runs from $\mu_1$ to $\mu_2$, which is given by the disjoint sets $O_1(\mu_1)$
and $O_2(\mu_2) \cup O_3(\mu_2)$. As a consequence, and because $O_2(x)$ and $O_3(x)$ are centered at
$\alpha$ and $\beta$, respectively, no square of the zeros of $p$ can have a modulus between $\mu_1^2$ and
$\min\{|\alpha| - \rho_1(\mu_2), |\beta| - \rho_2(\mu_2)\}$. If $|\alpha - \beta| > \rho_1(\mu_2) + \rho_2(\mu_2)$, then $O_2(\mu_2)$ and $O_3(\mu_2)$ are
also disjoint from each other, and by Gershgorin’s theorem must each contain a square of
a zero of $p$. For odd $\ell$, we consider $zp(z)$ instead of $p(z)$, as in the proof of Theorem 3.1,
and then proceed as in the even case. This proves part (a1).

(a2) Assume that $\ell$ is even. If $I_1 \cap I_2 = \emptyset$, then there are no values of $x$ for which both
$O_2(x)$ and $O_3(x)$ are disjoint from $O_1(x)$. When $I_1 \neq \emptyset$, then the smallest radius $O_2(x)$
can have while being disjoint from $O_1(x)$ is $\rho_1(t_2)$. If $O_2(t_2)$ and $O_3(t_2)$ are disjoint, i.e.,
when $|\alpha - \beta| > \rho_1(t_2) + \rho_2(t_2)$, then by Gershgorin’s theorem, $O_2(t_2)$ contains exactly one
square of a zero of $p$, while the other $n - 1$ are contained in $O_1(t_2) \cup O_3(t_2)$. The proof of
the analogous situation, where the roles of $I_1$ and $I_2$ are switched, then also follows. The
case where $\ell$ is odd is treated as before. This proves part (a2).

(b1) and (b2) The proof of parts (b1) and (b2) follows the exact same pattern as that
of parts (a1) and (a2), except that here the radius of $O_1'(x)$ is $x$ instead of $x^2$. All other
aspects are analogous. This concludes the proof of the theorem. □
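The intervals $I_1$ and $I_2$ that drive parts (a1) and (a2) of Theorem 4.1 can be computed along the following lines. This is a sketch under several assumptions: $S$ is a unitary triangularizing matrix built from one eigenvector, the positive roots of $\chi_1$ and $\chi_2$ are found with `numpy.roots` and a fixed tolerance (which will not handle a double root reliably), and all names, including `pellet_like_intervals`, are introduced here for illustration.

```python
import numpy as np

def companion(a):
    n = a.size
    C = np.zeros((n, n), dtype=complex)
    C[1:, :-1] = np.eye(n - 1)
    C[:, -1] = -a
    return C

def schur_2x2(A):
    """Unitary U such that U* A U is upper triangular (2x2 case)."""
    alpha = np.linalg.eigvals(A)[0]
    u1 = np.array([A[0, 1], alpha - A[0, 0]])
    u2 = np.array([alpha - A[1, 1], A[1, 0]])
    u = u1 if np.linalg.norm(u1) >= np.linalg.norm(u2) else u2
    if np.linalg.norm(u) < 1e-14:          # A is a multiple of the identity
        return np.eye(2, dtype=complex)
    u = u / np.linalg.norm(u)
    return np.array([[u[0], -np.conj(u[1])], [u[1], np.conj(u[0])]])

def positive_real_roots(poly, tol=1e-9):
    """Sorted positive real roots of a real polynomial (highest degree first)."""
    return sorted(r.real for r in np.roots(poly) if abs(r.imag) < tol and r.real > tol)

def pellet_like_intervals(b):
    """Intervals I1, I2 (or None) and the polynomials chi_1, chi_2 of Theorem 4.1."""
    b = np.asarray(b, dtype=complex)
    a = b if b.size % 2 == 0 else np.concatenate(([0.0], b))   # pad to even degree
    n = a.size
    C = companion(a)
    C2 = C @ C
    Q = np.kron(np.eye(n // 2), schur_2x2(C2[-2:, -2:]))
    CS2 = Q.conj().T @ C2 @ Q
    v, w = np.abs(CS2[:n - 2, n - 2]), np.abs(CS2[:n - 2, n - 1])
    al, be, ga = abs(CS2[n - 2, n - 2]), abs(CS2[n - 1, n - 1]), abs(CS2[n - 2, n - 1])
    chi1 = np.concatenate(([1.0, 0.0, -al], v[::-1]))
    chi2 = np.concatenate(([1.0, 0.0, -be, ga], w[::-1]))

    def interval(poly):
        r = positive_real_roots(poly)
        return (r[0], r[1]) if len(r) == 2 else None

    return {"I1": interval(chi1), "I2": interval(chi2), "chi1": chi1, "chi2": chi2}
```

For a polynomial with one dominant coefficient, such as $z^4 + 10z^3 + z^2 + z + 1$ (`b = [1, 1, 1, 10]`), at least one of the two intervals is nonempty, and $\chi_1$ or $\chi_2$ is negative in its interior, which is exactly the disjointness condition used in the proof.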

We note that $|\beta| > |\gamma|$ is a necessary condition for $\omega_2(x) = 0$ to have positive solutions.

The previous theorem, which focuses on the isolation of zeros, but does not try to
optimize the inclusion sets, can be enhanced in several ways, depending on the situation.
As an example, consider the case (a2), where $I_2 \neq \emptyset$ and $|\alpha - \beta| > \rho_1(u_2) + \rho_2(u_2)$. If
$O_2(u_2)$ happens to be contained in $O_1(u_2)$, then there exists a value $u^* < u_2$ for which
$O_2(u^*)$ still lies inside $O_1(u^*)$, but is tangent to it. In that case, the Gershgorin column
set of $F_x(p)$, given by the union of the two disjoint sets $O_3(u_2)$ and $O_1(\max\{u_1, u^*\})$,
is smaller than the one in the theorem. In fact, in this case, even when $O_2(u_2)$ does
not lie inside $O_1(u_2)$, $O_1(u) \cup O_2(u)$ for some values $u < u_2$ may be smaller than the
corresponding set in the theorem. Other cases may be similarly improved. Figure 1 shows
an example of case (a2) as we just described with $u^* < u_1$: on the left is the enhanced
Gershgorin set, while the Gershgorin set from the theorem is shown on the right. It was
obtained for the polynomial 2 ® + (4 + ‘3i)z^ + (3 — i)z^r + 4 + 2i)z^ — (3 + 2i)z'^ — ^z + A + i. 
Figure [ 2 ] shows a similar example, where this time u* > 1 * 1 , obtained for the polynomial 
z® — 2iz^ + (3 + Ai)z‘^ + (3 + i)z^ — (2 + i)z‘^ + 2iz + 2 + h The asterisks in the hgures 
indicate the squares of the zeros of the polynomials. 
Figure 2: Enhancement of case (a2) with $u^* > u_1$.


In the introduction, we mentioned that Theorem 1.1 can be obtained by applying
Gershgorin’s theorem to a similarity transformation of $C(p)$, and it can be shown that
similarly applying the Block Gershgorin theorem to $C_S^2(p)$ is equivalent to applying its
matrix version, namely, Theorem 1.4, to the matrix polynomial $P_S$, defined by (9), when
$A_{m-1}$ is diagonalizable. However, for Pellet’s theorem, applying the Block Gershgorin
theorem to $C_S^2(p)$ leads to a more subtle result than its matrix version, which is the next
theorem. For this theorem we will assume $A_{m-1}$ to be diagonalizable.

Theorem 4.2. For a polynomial $p(z) = z^\ell + b_{\ell-1}z^{\ell-1} + \cdots + b_1 z + b_0$ with complex
coefficients, $\ell \geq 3$, define $m = \lceil \ell/2 \rceil$, $n = 2m$, and the complex numbers $a_j$ as follows:

$$a_j = b_j \quad (\ell \text{ is even and } j = 0, \dots, n-1) ,$$
$$a_0 = 0 \ \text{ and } \ a_j = b_{j-1} \quad (\ell \text{ is odd and } j = 1, \dots, n-1) .$$

Furthermore, define

$$A_0 = \begin{pmatrix} -a_0 & a_{n-1}a_0 \\ -a_1 & a_{n-1}a_1 - a_0 \end{pmatrix}
\quad \text{and} \quad
A_j = \begin{pmatrix} -a_{2j} & a_{n-1}a_{2j} - a_{2j-1} \\ -a_{2j+1} & a_{n-1}a_{2j+1} - a_{2j} \end{pmatrix}
\quad \text{for } j = 1, \dots, m-1 ,$$

and let $A_{m-1}$ be diagonalizable by a matrix $S$ such that
$S^{-1}A_{m-1}S = \begin{pmatrix} \alpha & 0 \\ 0 & \beta \end{pmatrix}$, where
$\alpha, \beta \in \mathbb{C}$ and $|\alpha| \leq |\beta|$. Define the $2 \times 2$ matrices $T_j = S^{-1}A_j S$ ($j = 0, \dots, m-1$) and
the complex vectors $v$ and $w$ as

$$\begin{pmatrix} v_{2j} & w_{2j} \\ v_{2j+1} & w_{2j+1} \end{pmatrix} = T_j \quad \text{for } j = 0, \dots, m-2 .$$

For $x > 0$, let

$$\tau(x) = \frac{\|T_{m-2}\|}{x} + \frac{\|T_{m-3}\|}{x^2} + \cdots + \frac{\|T_0\|}{x^{m-1}} ,$$
$$\Omega_1(x) = x^m - |\alpha|x^{m-1} + \|T_{m-2}\|x^{m-2} + \cdots + \|T_1\|x + \|T_0\| ,$$
$$\Omega_2(x) = x^m - |\beta|x^{m-1} + \|T_{m-2}\|x^{m-2} + \cdots + \|T_1\|x + \|T_0\| ,$$

and define

$$O_1''(x) = \bar{O}(0; x) , \quad O_2''(x) = \bar{O}(\alpha; \tau(x)) , \quad O_3''(x) = \bar{O}(\beta; \tau(x)) .$$

If $\Omega_1(x) = 0$ has positive solutions $x_1$ and $x_2$, set $K_1 = [x_1, x_2]$, otherwise $K_1 = \emptyset$. If
$\Omega_2(x) = 0$ has positive solutions $y_1$ and $y_2$, set $K_2 = [y_1, y_2]$, otherwise $K_2 = \emptyset$. Then the
following holds.

(a) If $K_1 \neq \emptyset$, then $K_2 \neq \emptyset$, $K_1 \subseteq K_2$, and $\ell - 2$ of $p$’s zeros lie in $O_1''(\sqrt{x_1})$, while the
union $O_2''(x_2) \cup O_3''(x_2)$ contains the squares of the two remaining zeros, i.e., $p$ has no
zeros with a modulus between $\sqrt{x_1}$ and $(|\alpha| - \tau(x_2))^{1/2}$. If, in addition, $|\alpha - \beta| > 2\tau(x_2)$,
then $O_2''(x_2)$ and $O_3''(x_2)$ each contain one square of a zero of $p$.

(b) If $K_1 = \emptyset$ and $K_2 \neq \emptyset$, then if $|\alpha - \beta| > 2\tau(y_2)$, the squares of $\ell - 1$ of $p$’s zeros
are contained in the union $O_1''(y_2) \cup O_2''(y_2)$, while the remaining square of a zero lies in
$O_3''(y_2)$.

Proof. The proof is similar to the proof of Theorem 4.1 with minor differences.
Assume first that $\ell$ is even. Here we apply Theorem 1.6, the block Gershgorin theorem,
to the matrix defined in (6). As we showed in Section 2, this produces the block Gershgorin
column set $G_1 \cup G_2$, with the sets $G_j$ as defined in the statement of Theorem 1.6. There
we saw that $G_1 = O_1(x)$ and that $G_2 = O_2(x) \cup O_3(x)$, the union of two discs centered
at $\alpha$ and $\beta$, respectively, with identical radii $\tau(x)$. $O_1(x)$ is disjoint from the other two
discs if

\[ |\alpha| > x + \tau(x) \iff \Omega_1(x) < 0 \quad \text{and} \quad |\beta| > x + \tau(x) \iff \Omega_2(x) < 0 . \]

Clearly, because $|\alpha| \leq |\beta|$, if $\Omega_1(x) = 0$ has two positive solutions $x_1$ and $x_2$, then $\Omega_2(x) = 0$
also has two positive solutions $y_1$ and $y_2$, with $y_1 \leq x_1 < x_2 \leq y_2$. From here on, the
proof follows analogously to that of Theorem 4.1. The case when $\ell$ is odd is treated
analogously. □
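The role of the two real equations in Theorem 4.2 can be illustrated with a small numerical sketch. It assumes that the norms ||T_0||, ..., ||T_{m-2}|| and the modulus of the disc center (|α| or |β|) are already available, and it uses the equivalence from the proof: the polynomial is negative exactly where x + τ(x) is smaller than that modulus. The function names `tau` and `pellet_interval` are hypothetical and chosen only for this illustration; the minimization and bisection are a generic choice, not a method prescribed by the text.

```python
def tau(x, norms):
    """tau(x) = ||T_{m-2}||/x + ... + ||T_0||/x^(m-1), norms = [||T_0||, ..., ||T_{m-2}||]."""
    m = len(norms) + 1
    return sum(t / x ** (m - 1 - j) for j, t in enumerate(norms))


def _bisect(h, a, b, tol):
    """Root of h in [a, b], assuming h(a) and h(b) have opposite signs."""
    sign_a = h(a) > 0
    while b - a > tol * b:
        mid = 0.5 * (a + b)
        if (h(mid) > 0) == sign_a:
            a = mid
        else:
            b = mid
    return 0.5 * (a + b)


def pellet_interval(center_abs, norms, tol=1e-12):
    """Two positive roots of x^m - center_abs*x^(m-1) + sum_j norms[j]*x^j, or None.

    Assumption of this sketch: at least one norm is positive; the degenerate
    case with all norms zero is not handled and also returns None.
    """
    if center_abs <= 0 or not any(t > 0 for t in norms):
        return None

    def g(x):
        return x + tau(x, norms)

    # g is convex on (0, inf) and g(x) < center_abs forces x < center_abs,
    # so a ternary search on (0, center_abs] locates the relevant minimum.
    lo, hi = 0.0, center_abs
    while hi - lo > 1e-9 * hi:
        m1 = lo + (hi - lo) / 3
        m2 = hi - (hi - lo) / 3
        if g(m1) < g(m2):
            hi = m2
        else:
            lo = m1
    x_min = 0.5 * (lo + hi)
    if g(x_min) >= center_abs:
        return None  # the polynomial is never negative for x > 0

    # g tends to infinity as x -> 0+, so halving finds a left bracket.
    a = x_min
    while g(a) <= center_abs:
        a *= 0.5

    def h(x):
        return g(x) - center_abs

    return _bisect(h, a, x_min, tol), _bisect(h, x_min, center_abs, tol)


# Toy data (not from the paper): m = 2, ||T_0|| = 1, |alpha| = 3, |beta| = 4.
K1 = pellet_interval(3.0, [1.0])  # roots of x^2 - 3x + 1
K2 = pellet_interval(4.0, [1.0])  # roots of x^2 - 4x + 1
```

For this toy data both intervals exist and K1 is contained in K2, in line with part (a) of the theorem; with a center modulus of 1.5 the function returns None, which corresponds to an empty interval.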

Enhancements similar to those of Theorem 4.1 can be obtained for this theorem as well.

The theorems in this section derive inclusion regions that are sometimes given in terms
of the squares of the zeros of p. These results are easily translated to bounds on and gaps
between the moduli of the zeros of p. However, the inclusion sets themselves are slightly
more complicated. If a disc, centered at $c$ with radius $R$, contains the squares of the zeros of
a polynomial, then any zero $z$ satisfies $|z^2 - c| \leq R$, which implies that $|z + \sqrt{c}|\,|z - \sqrt{c}| \leq R$,
i.e., the zeros lie in a region bounded by an oval of Cassini with foci $\pm\sqrt{c}$. As an illustration, let
us consider a situation where the squares of the zeros of a polynomial are contained in the
union of a disc, centered at the origin with radius $R_1$, and another disc, centered at $c$ with
radius $R_2$. Figure 3 shows such discs, containing the squares of the zeros, with $c = 6 + 6i$,
$R_1 = 4$, and $R_2 = 3$, while the corresponding inclusion region for the zeros themselves,
the union of a disc and an oval of Cassini (consisting of two loops because the disc centered
at $c$ is bounded away from the origin), is shaded in gray.
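The translation from discs for the squares of the zeros to a region for the zeros themselves can be checked numerically. The following minimal sketch tests membership in the union of a disc and a Cassini region for the values c = 6 + 6i, R_1 = 4, R_2 = 3 quoted above. The function names are hypothetical, and the two-loop test uses the standard fact that an oval of Cassini with foci ±√c splits into two loops exactly when R < |c|.

```python
import cmath


def in_cassini_region(z, c, R):
    """True if |z - sqrt(c)| * |z + sqrt(c)| <= R, i.e. z^2 lies in the closed disc O(c; R)."""
    s = cmath.sqrt(c)
    return abs(z - s) * abs(z + s) <= R


def has_two_loops(c, R):
    """The Cassini region consists of two separate loops when the disc O(c; R) excludes the origin."""
    return abs(c) > R


def in_inclusion_region(z, c, R1, R2):
    """Region for a zero z when its square lies in O(0; R1) or in O(c; R2)."""
    return abs(z) ** 2 <= R1 or in_cassini_region(z, c, R2)


c, R1, R2 = 6 + 6j, 4.0, 3.0
focus = cmath.sqrt(c)

print(has_two_loops(c, R2))                  # |c| is about 8.49, larger than R2
print(in_inclusion_region(focus, c, R1, R2))  # a focus lies inside its loop
print(in_inclusion_region(2.5, c, R1, R2))    # 2.5^2 = 6.25 lies in neither disc
```

The disc for the zeros has radius √R_1 = 2, since |z|^2 ≤ R_1 is the same as |z| ≤ √R_1.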

We conclude by pointing out that a similar approach has the potential to improve 
analogous results for matrix polynomials, especially when the matrix coefficients are of 
moderate size compared to the degree of the polynomial, in which case one can argue 
that the corresponding block companion matrices also have a diagonal mostly composed 
of zeros; squaring them may lead to smaller inclusion regions there as well. 
Figure 3: Inclusion regions for the squares of the zeros (circles) and the zeros themselves
(shaded).


5 Numerical comparisons. 

In this section we illustrate our results numerically, while comparing them to the classical
Theorems 1.1 and 1.2. To do so, we created sets of random polynomials, to which, for
the Cauchy-like results of Section 3, we applied Theorem 1.1, Theorem 1.4 applied to the
matrix polynomial $P_s$ defined by (9), and parts (a) and (b) of Theorem 3.1, while for
the Pellet-like results of Section 4 we compared Theorem 1.2 with $k = 1, 2$, Theorem 1.5
applied to the aforementioned matrix polynomial $P_s$ with $k = 1$, all parts of Theorem 4.1,
and Theorem 4.2. For the Cauchy-like results, we compared the averages of the ratio of
the upper bounds and the modulus of the largest zero, i.e., the closer this number is to
one, the better the bound, and we also recorded the number of times each method gave
the best upper bound on the moduli of the zeros. For the Pellet-like results, we compared
the number of times zeros (or squares of zeros) could be isolated from the others for
each method, which is generally the most important use of these methods. We chose the
Euclidean norm (2-norm) when applying Theorem 1.6, and the diagonalizing matrix $S$ was
chosen to have normalized columns. No polynomials were generated where $A_{m-1}$ was not
diagonalizable, and its eigenvalues $\alpha$ and $\beta$ as they appear in all our results were ordered
so that $|\alpha| \leq |\beta|$. No special choice of coefficients was made in Sets 1-2, but Sets 3-4
exhibit specific relative magnitudes of some coefficients to better illustrate the advantages
of our methods.

The sets of polynomials are defined below, with n indicating the degree of the polynomials.

Set 1: n=20, the leading coefficient is one and the other coefficients have real and imaginary
parts that are uniformly randomly distributed on the interval [-2, 2].

Set 2: n=20, the leading coefficient is one and the other coefficients have real and imaginary
parts that are uniformly randomly distributed on the interval [-4, 4].

Set 3: n=20, the first four coefficients are 1, 2, 6, 2 and the other coefficients have real and
imaginary parts that are uniformly randomly distributed on the interval [-4, 4].

Set 4: n=20, the first four coefficients are 1, 2, 8, 2 and the other coefficients have real and
imaginary parts that are uniformly randomly distributed on the interval [-4, 4].
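As a rough illustration of how such a test set and the classical Cauchy bound of Theorem 1.1 might be computed, the sketch below draws one polynomial in the manner of Set 1 and finds the unique positive solution of x^n - |a_{n-1}|x^{n-1} - ... - |a_0| = 0 by bisection. The function names, the coefficient ordering, the seed, and the use of bisection are assumptions made for this example; the text does not say how the experiments were implemented.

```python
import random


def random_monic_polynomial(degree, half_width, rng):
    """Coefficients [a_0, ..., a_{degree-1}, 1] of a monic polynomial whose other
    coefficients have real and imaginary parts uniform on [-half_width, half_width]."""
    coeffs = [complex(rng.uniform(-half_width, half_width),
                      rng.uniform(-half_width, half_width))
              for _ in range(degree)]
    return coeffs + [1 + 0j]


def cauchy_radius(coeffs, tol=1e-13):
    """Unique positive root s of x^n - |a_{n-1}| x^{n-1} - ... - |a_0| = 0
    for a monic polynomial given as [a_0, ..., a_{n-1}, 1]."""
    mods = [abs(a) for a in coeffs[:-1]]
    n = len(mods)
    if not any(mods):
        return 0.0  # p(z) = z^n has all its zeros at the origin

    def f(x):
        return x ** n - sum(m * x ** k for k, m in enumerate(mods))

    # f is negative on (0, s) and positive beyond s; f(1 + max|a_j|) > 0.
    lo, hi = 0.0, 1.0 + max(mods)
    while hi - lo > tol * hi:
        mid = 0.5 * (lo + hi)
        if f(mid) < 0:
            lo = mid
        else:
            hi = mid
    return hi


rng = random.Random(0)                           # fixed seed, an arbitrary choice
p = random_monic_polynomial(20, 2.0, rng)        # one polynomial in the style of Set 1
print(cauchy_radius(p))                          # upper bound on the moduli of its zeros
print(cauchy_radius([-2, -1, 1]))                # z^2 - z - 2
```

For z^2 - z - 2, whose zeros are 2 and -1, the equation is x^2 - x - 2 = 0 and the bound is 2, so the ratio of the bound to the largest modulus is one, the best possible value of the quantity reported in the Cauchy-like comparison.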

For each set we generated 1000 random polynomials and collected the results in Table 1
for the Cauchy-like methods and in Table 2 for the Pellet-like methods. In Table 1 the
methods are listed across the top, and in an entry of the form γ/j, γ is the average ratio
of the upper bound to the modulus of the largest zero, while j is the number of times
that a particular method delivered the best ratio. It is clear from these results that the
classical result by Cauchy (Theorem 1.1) is almost always worse than the other methods.
In Table 2 the methods are listed as in Table 1, and in an entry of the form i/j/k, i is
the number of times two zeros could be isolated not only from the n - 2 remaining ones,
but also from each other, j is the number of times two zeros could be isolated from the
other n - 2 ones, but not from each other, and k is the number of times a single zero could
be isolated. For Pellet's theorem (first column), in an entry of the form i/j, i and j are
the number of times one and two zeros could be isolated, respectively, from the remaining
zeros. For Pellet's theorem's matrix version (second column), we listed the number of
times two zeros could be isolated from the remaining n - 2.

Our Pellet-like methods appear to be more sensitive, i.e., better able to isolate zeros, 
and do not seem to require as large a difference between the magnitudes of appropriate 
coefficients as is the case for Pellet’s theorem. Moreover, the zero inclusion regions defined 
by Pellet’s theorem are cruder than the results derived here. The difference with Pellet’s 
theorem can be quite dramatic, as for Set 1, where Pellet's theorem was able to isolate
zeros in only one case, compared to more than 140 cases for our methods, and also Set 2,
where it was able to isolate zeros for only 119 cases as opposed to more than 580 for our
methods. A similar observation holds in the case of the isolation of two zeros for Set 3 
and Set 4. 

We observed that the results for all sets of polynomials did not seem sensitive to the
degree of the polynomial, delivering very similar results when the degree was doubled, nor is
there any appreciable difference between even and odd degrees. However, the Pellet-like
results are sensitive to the range of the real and imaginary parts of the randomly generated
coefficients: the larger the range, the better the results, as it caused larger differences
between the magnitudes of the coefficients, thereby increasing the likelihood that zeros can
be separated. The effect of this on the Cauchy-like results was not significant.

We note that in m the matrix version of Pellet’s theorem was applied to but 

not C‘g{p), resulting in a worse performance there. 
         Theorem 1.1    Theorem 1.4        Theorem 3.1 (a)    Theorem 3.1 (b)
         (Cauchy)       (Matrix Cauchy)

Set 1    1.26 / 8       1.11 / 401         1.11 / 529         1.13 / 62
Set 2    1.23 / 11      1.07 / 200         1.06 / 757         1.08 / 32
Set 3    1.58 / 0       1.10 / 974         1.15 / 17          1.11 / 9
Set 4    1.48 / 0       1.06 / 991         1.10 / 3           1.07 / 6

Table 1: Cauchy-like results - ratio upper bound to modulus and number of times the 
bound outperformed the others. 



         Theorem 1.2    Theorem 1.5        Theorem 4.1 (a)    Theorem 4.1 (b)    Theorem 4.2
         (Pellet)       (Matrix Pellet)

Set 1    1 / 0          0                  0 / 0 / 203        0 / 0 / 145        0 / 0 / 214
Set 2    119 / 0        0                  0 / 0 / 666        0 / 0 / 585        0 / 0 / 653
Set 3    0 / 0          38                 4 / 0 / 345        17 / 0 / 243       38 / 0 / 0
Set 4    0 / 0          976                501 / 0 / 498      908 / 0 / 90       976 / 0 / 0


Table 2: Pellet-like results - number of times inclusion regions for one and two zeros can 
be found. 